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The behavior of an atom in a molecule, liquid or solid is governed by the force it experiences. If the 
dependence of this vectorial force on the atomic chemical environment can be learned efficiently with 
high-fidelity from benchmark reference results—using “big data” techniques, i.e., without resorting 
to actual functional forms—then this capability can be harnessed to enormously speed up in silico 
materials simulations. The present contribution provides several examples of how such a force field 
for A1 can be used to go far beyond the length-scale and time-scale regimes accessible presently 
using quantum mechanical methods. It is argued that pathways are available to systematically and 
continuously improve the predictive capability of such a learned force field in an adaptive manner, 
and that this concept can be generalized to include multiple elements. 


The dynamic behavior of an atom in a molecule, liquid 
or solid is directly determined by the local force it experi¬ 
ences. Nevertheless, as already pointed out by Feynman 
[1], forces are generally viewed as secondary computed 
quantities and are obtained through the agency of the 
total potential energy—a global property of the entire 
system. In practice, forces on atoms are obtained ei¬ 
ther as by-products during a potential energy evaluation, 
or from the first derivative of the potential energy with 
respect to the atomic positions. This outlook still per¬ 
meates modern materials simulation efforts, regardless 
of whether one adopts a quantum mechanical or (semi- 
)empirical prescription for energy and force predictions. 
Direct and rapid access to atomic forces, given just the 
atomic configuration of a system (molecule, liquid, or 
solid), immediately makes it possible to perform efficient 
geometry optimizations, molecular dynamics (MD) sim¬ 
ulations, and a host of other related and relevant simu¬ 
lations. If the capability to predict forces preserves the 
fidelity of high-level quantum mechanics based methods, 
but comes at a minuscule fraction of the cost, and if 
this capability can be extended systematically and pro¬ 
gressively to potentially all configurational and chemical 
environments that an atom may experience, we will have 
a powerful and adaptive materials simulation scheme. 

The present contribution lays the ground work and 
takes initial steps towards the above vision. A new 
scheme is presented that systematically learns in an in¬ 
terpolate manner to predict atomic forces in environ¬ 
ments encountered during the dynamical evolution of ma¬ 
terials from a set of reference atomic configurations and 
high-level calculations. This concept is resonant with 
emerging data-driven (or “big data” [2-6]) approaches 
aimed at materials discovery in general [7, 8], as well 
as at accelerating materials simulations [9-13]. Machine 
learning (ML) methods using neural networks [9, 10] and 
Gaussian processes [11, 12] have been successful in the 
development of interatomic potentials, wherein the po¬ 
tential energy surface is learned from a set of higher-level 
(quantum mechanics based) reference calculations. 


The distinctive aspect of the present contribution, 
namely, learning to predict atomic forces directly (rather 
than the potential energy) from past data is far more 
powerful, but has been suggested only recently [12, 13] 
(to accelerate ab initio MD simulations on-the-fly). Here, 
we propose the creation of a stand-alone purely data- 
driven force prediction recipe devoid of any explicit func¬ 
tional form. This force field is adaptive (i.e., new congu- 
rational environments can be systematically incorporated 
as required), generalizable (i.e., the scheme can be ex¬ 
tended to any collection of elements for which reliable 
reference calculations can be performed), accurate (i.e., 
forces can be predicted to within 0.05 eV/A of the ref¬ 
erence calculations) and fast (i.e., a speed-up of over 8 
orders of magnitude with respect to the corresponding 
quantum mechanics based simulations may be expected). 
A practical scheme that exploits the rapid high-fidelity 
force prediction capability within a materials simulation 
framework is presented, and demonstrated for A1 in sev¬ 
eral configurational environments and dynamical situa¬ 
tions that go well beyond the reaches of conventional first 
principles simulations. Pathways to extend this concept 
to handle multi-elemental systems are also proposed. 

Central to this development is a robust scheme to nu¬ 
merically and simply represent, or fingerprint , the atomic 
environments. Such a fingerprint should differentiate dis¬ 
similar configurations with adequate accuracy, and be 
invariant to transformations of the environment such as 
translation, rotation and permutation of like elements. 
While several such prescriptions have been proposed in 
the past [12-18], the present objective, namely, mapping 
the vectorial force experienced by an atom to its configu¬ 
rational environment, places stringent constraints on the 
nature of the fingerprint. We argue that the following 
fingerprint function, may be used to accurately 

represent the k th component of the force on atom i: 
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r^ is the distance between atoms i and j, while r\- is a 
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FIG. 1. Comparison of the forces predicted using the ML force field with reference DFT results, for (a) the trained model (light 
blue) and the validation dataset (dark blue), (b) a test unit cell containing over 800 A1 atoms in the fee phase, and (c) a test 
unit cell containing over 160 atoms in a hypothetical bee phase. In (b) and (c), atoms were randomly perturbed from their 
equilibrium positions. Insets show the distribution of the prediction errors leading to respectable mean absolute errors (MAE). 


scalar projection of this distance along component k. To 
determine the force on an atom, we require three such 
components along non-parallel directions. The parame¬ 
ter r] governs the extent of co-ordination around atom i 
that needs to be captured. The fingerprint is essentially 
a spectrum of V k values corresponding to predetermined 
choices of r] values, i.e., V k is defined in an 77 -grid. The 
diminishing influence of faraway atoms is handled by a 


damping function, /(r^-) = 0.5 cos + 1 


The 


summation in Eq. 1 runs over all neighboring atoms 
within an arbitrarily large cutoff distance R c (8 A, in the 
present work). By construction, the fingerprint will lead 
to symmetry-adapted forces. For instance, an atom in a 
centro-symmetric position will lead to a fingerprint with 
all zero values (and should correspond to a zero force). 

The next step is to map the fingerprints to appropri¬ 
ate force components. Here, we have adopted the ker¬ 
nel ridge regression (KRR) method, capable of handling 
complex non-linear relationships [12, 17, 18]. The KRR 
method works on the principle of similarity. By compar¬ 
ing an atom’s fingerprint, V k (r])^ with a set of reference 
cases, an interpolative prediction of the k th component 
of the force (F k ) can be made, and is given by 


F t = ^2 oif exp 
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t labels each reference atomic environment, and V k {j]) 
is its corresponding fingerprint. || V k {r]) — V^( 77 )|| is the 
Euclidean distance between the two atomic fingerprints, 
though other distance metrics can be used. a t s and a 
are the weight coefficients and length scale parameter, 
respectively. The optimal values for a t s and cr are deter¬ 
mined during the training phase, with the help of cross- 
validation and regularization methods [ 12 , 17, 19, 20]. 

Using the above prescribed framework, a ML force 
field for A1 has been developed using a plethora of ref¬ 


erence atomic environments accumulated from density 
functional theory (DFT) based MD runs at various tem¬ 
peratures using the Vienna ab initio simulation package 
[21-24] (other means may also be used to generate the 
reference data, as long as they satisfy prescribed demands 
on accuracy of the atomic forces). To ensure a diverse set 
of reference cases, A1 in different geometric arrangements 
were considered (but each one with just a few tens of 
atoms per repeating unit cell), including defect-free bulk 
in the face centered cubic (fee) phase, bulk fee phase with 
vacancy, clean ( 111 ) surface, and the ( 111 ) surface with 
adatom, resulting in over 100,000 atomic environments 
[12]. Interestingly, a random set of 1000 atomic environ¬ 
ments drawn from the accumulated environments proved 
sufficient to construct an accurate interpolative force pre¬ 
diction model. Figure 1(a) compares the predicted forces 
with the DFT forces (including the error distribution in 
the inset) for all accumulated configurations, i.e., those 
used in the training phase and the remaining configura¬ 
tions whose results were used for validation. The mean 
absolute error (MAE) of the prediction model was 0.03 
eV/A, of the order of the expected chemical and numeri¬ 
cal accuracy of the reference DFT calculations. Needless 
to say, in addition to providing a pathway to accurately 
predict atomic forces, this procedure is also extremely 
expedient; it scales linearly with system size, and can be 
well over 8 orders of magnitude faster than a typical DFT 
calculation. 

An immediate (and straight-forward) application of 
this fast high-fidelity capability to predict atomic forces 
is geometry optimization, including the prediction of po¬ 
tential energy minima and saddle points. Simulations in¬ 
volving hundreds of thousands of atoms (i.e., cases that 
are beyond the reaches of present day DFT computa¬ 
tions) can be handled, provided the chemical environ¬ 
ments encountered during the course of such optimiza¬ 
tions are included in the force field. In order to under- 









































3 




900 800 700 600 500 300 200 150 100 50 

^ Temperature ITk] 



ReactionCoordinate 


FIG. 2. Arrhenius plots for (a) vacancy migration in bulk A1 and (b) adatom diffusion on the A1 (111) surface. For each 
temperature, the MD simulation time was extended so as to allow at least 50 hopping events (thus allowing estimation of an 
average hop rate, and the indicated error bar). A linear fit (solid red line) was used to determine the dynamic activation energy 
(ML E a ), and is compared with the static DFT activation energy (DFT E a ). (c) For the vacancy migration in bulk Al, the 
DFT potential energy along the migration trajectory (symbols and dashed line), and the corresponding energy obtained via an 
integration of the ML forces along the reaction coordinate (solid line). 


stand the limits of the constructed ML force field for Al 
within the context of such simulations, a few tests were 
performed. The first one involved a large unit cell con¬ 
taining over 800 Al atoms in the fee phase along with Al 
vacancies. Atoms were randomly perturbed, and the ML 
force field was used to optimize this perturbed structure. 
The correct equilibrium geometry was recovered, as as¬ 
certained by a separate DFT calculation starting with 
the same perturbed system. A video of this optimiza¬ 
tion is included in the Supplemental Information. Figure 
1(b) compares the predicted forces with the DFT forces 
for the initial perturbed geometry. Much larger unit cells 
could be considered using the ML force field but we re¬ 
strict ourselves to modest sizes in this discussion as we 
are constrained by the inability of DFT to provide vali¬ 
dation for truly large unit cells. 

As a second geometry optimization example, a 160 
atom unit cell of Al in a hypothetical body centered cubic 
(bee) phase was considered. Once again, the atoms were 
perturbed randomly, followed by geometry optimization. 
Figure 1(c) captures the performance of the force field 
for the starting geometry. Given that the bee phase was 
never used in the training phase during the force field 
creation, we would expect that forces on atoms in such 
an environment will be difficult to predict. Surprisingly, 
going by the rather small force error distribution (compa¬ 
rable to the 800-atom fee example), we conclude that the 
current choice of fingerprints allows us to effectively cap¬ 
ture diverse chemical environments in a versatile manner. 

Next, we consider non-zero temperature dynamical sit¬ 
uations. For the force prescription to correctly capture 
dynamic processes with high-fidelity, ergodicity has to 
be preserved. In other words, the average behavior and 
time scales of elementary steps or processes should be 
correctly represented during a MD simulation using the 
force field. As a first example, we consider the diffusion 


of an Al vacancy in bulk Al, using a unit cell contain¬ 
ing 32 Al sites and an Al vacancy. MD simulations were 
performed at 9 temperatures in the 500-900 K range for 
times up to 5 ns, with a timestep of 0.5 fs. By observing 
the dynamics of the vacancy, the average rate constant 
( k ) for the migration process at each temperature was 
determined, k is given as 1 / thop , where thop is the aver¬ 
age time taken for a vacancy to migrate to a neighboring 
site. To ensure that sufficient statistics are collected, k 
was averaged over 50 such hop events at each tempera¬ 
ture. Figure 2(a) shows an Arrhenius plot of k versus the 
reciprocal temperature, whose slope yields the activation 
energy (E a ) for Al vacancy migration to be 0.49 eV. The 
corresponding DFT value for a similar, but static, migra¬ 
tion process was determined to be 0.59 eV (c.f., Figure 
2(c)). Barrier “softening” is expected under dynamical 
conditions, relative to the results of static calculations in 
which entropic effects are neglected [25, 26]. 

Another elementary process we considered was the self¬ 
diffusion of an Al adatom on the Al (111) surface, using 
a 6 A x 6 A surface unit cell containing a 4-layer thick Al 
slab. Similar to the Al vacancy example, by monitoring 
the dynamics of the adatom across a temperature range 
of 50-300 K, an E a of 0.03 eV was predicted, as shown 
in Figure 2(b), whilst a static DFT calculation yielded 
an E a of 0.04 eV. A video of the adatom migration MD 
simulation is included in the Supplemental Information. 

Both dynamical diffusion scenarios considered lead to 
the correct Arrhenius behavior indicating that the under¬ 
lying physics is properly captured in the ML force field 
based MD simulations. Moreover, although the force field 
is aimed at directly predicting atomic forces, potential 
energy differences for elementary steps may be obtained 
by integrating the forces along a suitable reaction coordi¬ 
nate. Figure 2(c), for instance, portrays the DFT energy 
profile along the Al vacancy migration pathway in bulk 
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Al, as well as the corresponding energy determined by in¬ 
tegrating the forces predicted by the ML force field. The 
close agreement between the two energies is self-evident, 
indicating that energies corresponding to critical parts 
of a trajectory may indeed be obtained from the forces 
through integration. 

Lastly, we evaluate the prospect of how well thermal 
behavior of materials can be simulated using the force- 
based framework. In particular, we focus on the vibra¬ 
tional (or phonon) density of states (DOS), which has 
to be properly captured to allow for accurate calcula¬ 
tions of thermodynamic quantities, thermal expansion, 
thermal conductivity, etc. Figure 3(a) shows the phonon 
band structure as determined using the ML force field 
and using DFT, and in both cases, the finite displacement 
method was used [27]. Figure 3(b) shows the correspond¬ 
ing DOS, as well as the DOS computed using the Fourier 
transform of the velocity autocorrelation obtained from 
a MD simulation [28]. This latter approach implicitly in¬ 
cludes anharmonicity to all orders (the first method, in 
contrast, includes just the harmonic part). The MD sim¬ 
ulation involved a 864 atom unit cell, and a simulation 
time of 5 ps at 700 K. Excellent agreement of the ML 
force field result with the reference DFT calculations can 
be seen. The deviations of the DOS computed using MD 
simulations relative to that obtained using the finite dis¬ 
placement scheme (especially at high frequencies) may be 
attributed to non-zero anharmonic effects. The DOS can 
be utilized to determine thermodynamic properties such 
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FIG. 3. (a) Phonon band structure, (b) phonon density of 
states (DOS), and (c) Helmholtz free energy and constant vol¬ 
ume heat capacity computed using the ML force field (solid 
lines) and DFT (dashed lines). The phonon band structure 
and DOS were computed using the finite atomic displacement 
method. Also included in (b) are the DOS results obtained 
from the Fourier transform of the velocity autocorrelation 
function (solid cyan hatched fill). 


as the Helmholtz free energy and the constant volume 
heat capacity. These properties, as a function of temper¬ 
ature, are compared with the corresponding DFT results 
in Figure 3(c). The ML force field and DFT results are 
nearly indistinguishable, indicating that even under the 
stringent test of small atomic perturbations encountered 
in these situations (as opposed to the larger length scale 
vacancy or adatom hops discussed earlier), the fidelity of 
the force prediction is preserved. 

A natural question that arises at this point is how this 
force field paradigm may be extended to include multiple 
elements. In a multi-elemental system, the fingerprint of 
an atom of a given element type may be constructed to 
have as many parts as the number of elements in the 
system. Each part would represent the arrangement of 
atoms of a particular elemental type around the reference 
atom. While this scheme requires further optimization, 
preliminary work shows significant promise. For two bi¬ 
nary systems, a-A^Os and monoclinic Hf 02 , the force 
prediction based on the concatenated multi-component 
fingerprint prescription rivals that for the elemental Al 
in quality. A parity plot comparing the predicted force 
with the corresponding reference DFT result for each el¬ 
ement type is shown in the Supplemental Information. 
Given such accuracies, extension of the proposed concept 
to multielemental systems appears feasible. 

The discussion thus far has provided an expose of 
materials simulation examples that can benefit enor¬ 
mously through a capability to directly and rapidly pre¬ 
dict atomic forces with demonstrable verisimilitude. This 
capability learns from past reference quantum mechan¬ 
ical calculations, but can access length-scales and time- 
scales significantly beyond the reaches of purely quantum 
mechanics based simulations (while preserving accuracy). 
Examples of phenomena that can potentially be studied 
include transport (thermal and mass), phase transforma¬ 
tions and chemical reactions, mechanical behavior, ma¬ 
terials degradation and failure, etc., all within the frame¬ 
work of reality-mimicking non-zero temperature dynam¬ 
ical simulations. Widespread use of the proposed class 
of learning-based force fields will require attending to a 
few critical matters. These include: (i) creation of an 
initial compact training set of reference atomic environ¬ 
ments appropriate for a particular materials application; 
and, (ii) development of a capability to recognize a truly 
new atomic environment when such is encountered dur¬ 
ing the course of a simulation. The latter aspect is critical 
to evaluating when the force field is expected to fail, and, 
as importantly, to supplement the initial training set so 
as to make the force prediction scheme adapt, evolve and 
continuously improve with time. Nevertheless, these hur¬ 
dles have been encountered, and addressed, in the past in 
many “big data” situations [2-6]. Hence, there is reason 
for (cautious) optimism in the present context of high- 
fidelity, adaptive and generalizable atomic force fields. 

This work was supported financially by a grant from 
the Office of Naval Research (N00014-14-1-0098). The 
authors would like to acknowledge helpful discussions 






5 


with K. B. Lipkowitz, G. Pilania, T. D. Huan, A. 
Mannodi-Kanakkithodi and A. Dongare. Partial com¬ 
putational support through a Extreme Science and En¬ 
gineering Discovery Environment (XSEDE) allocation is 


also acknowledged. All calculations were performed in 
the pythonic environment, with the atomic simulation 
environment, pwtools and mlpy modules [29-31]. 


[1] R. P. Feynman, Phys. Rev. 56, 340 (1939). 

[2] J. Ginsberg, M. H. Mohebbi, R. S. Patel, L. Brammer, 
M. S. Smolinski, and L. Brilliant, Nature 457, 1012 
(2009). 

[3] H. Choi and H. Varian, Econ. Rec. 88, 2 (2012). 

[4] S. E. Hampton, C. A. Strasser, J. J. Tewksbury, W. K. 
Gram, A. E. Budden, A. L. Batcheller, C. S. Duke, and 
J. H. Porter, Front. Ecol. Environ. 11, 156 (2013). 

[5] M. M. Gobble, Res. Technol. Manage. 56, 64 (2013). 

[6] P. T. Metaxas and E. Mustafaraj, Science 338, 472 

( 2012 ). 

[7] R. Ramprasad, in Reviews in Computational Chemistry , 
edited by K. B. Lipkowitz (Wiley, 2010). 

[8] L. M. Ghiringhelli, J. Vybiral, S. V. Levchenko, C. Draxl, 
and M. Scheffler, Phys. Rev. Lett. 114, 105503 (2015). 

[9] J. Behler, Phys. Chem. Chem. Phys. 13, 17930 (2011). 

[10] S. Lorenz, A. Gro, and M. Scheffler, Chem. Phys. Lett. 
395, 210 (2004). 

[11] A. P. Bartok, M. C. Payne, R. Kondor, and G. Csanyi, 
Phys. Rev. Lett. 104, 136403 (2010). 

[12] V. Botu and R. Ramprasad, Int. J. Quantum Chem. 
(2014), 10.1002/qua.24836. 

[13] Z. Li, J. R. Kermode, and A. De Vita, Phys. Rev. Lett. 
114, 096405 (2015). 

[14] J. Behler, J. Chem. Phys. 134, 074106 (2011). 

[15] L. Yang, S. Dacek, and G. Ceder, Phys. Rev. B 90, 
054102 (2014). 

[16] A. P. Bartok, R. Kondor, and G. Csanyi, Phys. Rev. B 
87, 184115 (2013). 


[17] M. Rupp, A. Tkatchenko, K. R. Muller, and O. A. von 
Lilienfeld, Phys. Rev. Lett. 108, 058301 (2012). 

[18] G. Pilania, C. Wang, X. Jiang, S. Rajasekaran, and 
R. Ramprasad, Sci. Rep. 3, 2810 (2013). 

[19] K. Hansen, G. Montavon, F. Biegler, S. Fazil, M. Rupp, 
M. Scheffler, O. A. von Lilienfeld, A. Tkatchenko, and 
K. Muller, J. Chem. Theory Comput. 9, 3404 (2013). 

[20] T. Hastie, R. Tibshirani, and J. Friedman, The Elements 
of Statistical Learning: Data Mining, Inference, and Pre¬ 
diction , 2nd ed. (Springer, New York, 2009). 

[21] G. Kresse and J. Furthmuller, Phys. Rev. B 54, 11169 
(1996). 

[22] G. Kresse and D. Joubert, Phys. Rev. B 59, 1758 (1999). 

[23] J. P. Perdew, K. Burke, and Y. Wang, Phys. Rev. B 54, 
16533 (1996). 

[24] P. E. Blochl, Phys. Rev. B 50, 17953 (1994). 

[25] V. K. La Mer, J. Chem. Phys. 1, 289 (1933). 

[26] S. W. Benson, Thermochemical kinetics: methods for the 
estimation of thermochemical data and rate parameters , 
2nd ed. (Wiley, New York, 1976). 

[27] D. Alfe, Comp. Phys. Comm. 180, 2622 (2009). 

[28] J. M. Dickey and A. Paskin, Phys. Rev. 188, 1407 (1969). 

[29] S. R. Bahn and K. W. Jacobsen, Comput. Sci. Eng. 4, 
56 (2002). 

[30] S. Schmerler, “pwtools,” http://elcorto.bitbucket. 
org/pwtools/index.html (2015). 

[31] D. Albanese, R. Visintainer, S. Merler, S. Riccadonna, 
G. Jurman, and C. Furlanello, “mlpy: Machine learning 
python,” (2012), arXiv:1202.6548. 



